TONEMICS, MORPHOTONEMICS, AND TONAL MORPHEMES


Wm. E. Welmers 
Hartford Seminary Foundation 


Most language students, and even a shocking number of 
linguists, still seem to think of tone as a species of esoteric, 
inscrutable, and utterly unfortunate accretion characteristic of 
underprivileged languages -- a sort of cancerous malignancy 
afflicting an otherwise normal linguistic organism. Since there 
is thought to be no cure -- or even reliable diagnosis -- for 
this regrettable malady, the usual treatment is to ignore it, in 
hope that it will go away of itself. Now, there is undeniably a
danger of oversimplifying the complexities of tonal structure
in language: the sophisticate is tempted to be blasé. But at the
same time, the presence of tone need not cause us to tremble in
our scientific boots, or to bury our heads in the sands of "in-
sufficient data". In principle, the varieties and functions of ton-
al contrasts in language are of the same order as the varieties
and functions of any other contrasts; the problems of tonal an- 
alysis are simply typical problems of linguistic analysis. 


The initial difficulty of tone is probably the same as that
of stress, length, and some other phenomena such as pharynge- 
alization. If it is once recognized that such phenomena, famil- 
iar to us in the extralinguistic world of sound, may have a dis- 
tinctly linguistic function, the battle is half won. From that 
point on, the theory and procedure of analysis are the same as 
for any other part of a phonologic system. We hear two items. 
Are they perceptually the same or different? Probably no one
is "tone deaf" in the sense that he cannot perceive any pitch
differences. And, if anything, the substitution frame techniques 
outlined by Pike in his Tone Languages make this part of the 
analysis somewhat easier for tone than for consonants or vow- 
els. If two items are different, do they occur in contrast, in 
complementary distribution, or in free variation in terms of 
their phonological environment? That, of course, is a question 
basic to all linguistic analysis. 


There may also be an initial difficulty in labeling tone: 
some students have genuine trouble deciding which way is "up" 
and which way is "down" in pitch, and consistent reversal of 
terminology has been observed. But such labeling difficulties 
require only a little elementary drill -- usually less than is 
required for beginning students of phonetics who cannot identi- 
fy "higher" and "lower" tongue positions in vowel articulation. 


Once these initial difficulties are disposed of, the process 
of demonstrating that a higher and a lower pitch represent dif- 
ferent phonemes in a given language is quite comparable to 
showing that vowel sounds such as [i] and [u] represent differ-
ent phonemes. But to say that [i] and [u] are in phonemic con-
trast, and to give a few examples of each, is to say very little.
Full statements of the articulations, the allophonics, the dis- 
tributions, and the morphophonemic alternations of these vow- 
els may be quite complex. And so it is with tone. Displaying 
a few minimal pairs involving tone may tell something most 
convincingly, but it tells only a very little. As a sort of frame- 
work for more comprehensive statements of tonal structure, 
an outline of some of the types of tonal contrasts and the vari- 
eties of their functions in selected languages of the world may 
be useful. This outline is undoubtedly far from exhaustive, but 
perhaps it contains some new information and some fresh view- 
points. 


Pike defines a tone language as "a language having lexi-
cally significant, contrastive, but relative pitch on each sylla-
ble". This definition really says too much, particularly in as-
sociating pitch with every syllable. I would like to suggest a
somewhat broader definition, without claiming that it is be- 
yond improvement: a tone language is a language in which both 
pitch phonemes and segmental phonemes enter into the compo- 
sition of at least some morphemes. Like Pike's definition, 
this excludes intonational languages like English. Such lan- 
guages have many morphemes composed of segmental phonemes 
without pitch phonemes, and some intonational morphemes com- 
posed of pitch phonemes without segmental phonemes; but they 
have no morphemes that include both types of phonemes. A 
tone language may conceivably have some morphemes that con- 
tain no pitch phonemes, and certainly many tone languages have 
some morphemes composed exclusively of pitch phonemes. But 
the distinctive characteristic of a tone language is that some of 
its morphemes (usually nearly all of them) contain both seg- 
mental phonemes and pitch phonemes. There seems to be no 
known language in which pitch is significant only for units larg- 
er than a morpheme (words, perhaps), but smaller than a 
phrase. 


The kinds of pitch phenomena that enter into phonemic con- 
trasts are varied; and the recurrence of similar contrasts in 
restricted areas, such as Southeast Asia or West Africa, sug- 
gests that a typological classification may be legitimate and 
useful. Pike classifies tone languages as having "register sys-
tems" (definable in terms of levels of pitch) or "contour sys-
tems" (definable in terms of direction of pitch change), and adds
that some languages are basically of one type with an overlap of 
the other type. Actually, what Pike would call "combinatory 
types" appear to be the rule rather than the exception; the basis 
of his dichotomy may have some validity, but it is not well stat- 
ed. The so-called contour languages of Asia do seem to form 
a typological group. What does seem to be distinctive in most 
of them, and in few if any other languages, is the existence of at
least one unit toneme in which two components are significant: 
the direction of pitch change, and also the position of the entire 
glide within the pitch range of the environment. For example, 
Vietnamese has unit tonemes that must be described as "high 
rising" and "low rising"; Cantonese has "high rising", "low ris-
ing", "high falling", and "low falling"; Peiping Mandarin has
"high rising" and "complete falling". The label "contour lan-
guages" may still be the best for this type, but "contour" is un-
defined, and mixed types are nearly if not completely excluded.


Many languages of West and Central Africa, as well as the 
few American tone languages of which I have any knowledge,
can be rather simply described as having two, three, or four 
level tonemes (five in one instance), and perhaps also a unit ris- 
ing toneme and/or a unit falling toneme. In most such languages 
(Hausa is an exception), each toneme is restricted to a relative- 
ly narrow range of absolute pitch within a phrase, and these to- 
nemic ranges are discrete throughout the phrase, though they 
may all tilt downward at the very end of the phrase in a brief 
final contour. Thus, in a three-level system, high tone near the 
end of a phrase has virtually the same absolute pitch as a high 
tone anywhere else in the phrase, and is higher than any mid 
tone in the phrase. Usually there are few restrictions in tone 
sequences. These phenomena may be illustrated by the follow-
ing sentence in Jukun (Takum dialect, eastern Nigeria):

    dni zé sira a syini bt.    'Who brought these yams?'

While the tones of Hausa sound quite different from
this, there is reason to assign Hausa to this type rather than 
the next. Without going into detail, the ranges of the two to- 
nemes of Hausa may also be described as discrete if we note 
that they are arranged in the form of parallel downward slopes 
from the beginning to the end of the phrase. Languages with 
such systems may be called "discrete level languages".


Quite a different type of tonemic structure is found in 
every southern Bantu language I have heard, apparently in 
Kikongo, and also in Efik, Tiv, and some other languages of 
West Africa. Exactly three level tonemes seem to be essen- 
tial to this type, but they can hardly be called "low", "mid", 
and "high" without using distorted definitions. Recurrent low 
pitches in a phrase may be rather easily identified, and rep-
resent a toneme "low". But of the two non-low tonemes, one
must be described as "same as preceding non-low", and the
other as "lower than preceding non-low"; we may abbreviate
these descriptions to "same" and "drop". Now, if these are de-
fined in terms of a preceding non-low, what about the first non-
low tone of a phrase? Precisely at that point, these languages
show no contrast between different non-low tones. After the 
first non-low, by our definitions, the tonemic sequence "drop- 
same" is a sequence of identical pitches assigned to different 
tonemes. Conversely, the tonemic sequence "drop-drop" is a
sequence of different pitches assigned to the same toneme. It 
is precisely the sameness or the dropping that is significant. 

In a typical long phrase, an effect of terraced descent is heard. 
Every continuation on a given terrace is tonemically a "same";
every step down to a new terrace is tonemically a "drop", and
that level becomes the new point of reference; rising to a pitch 
higher than the preceding non-low is impossible. Thus the 
number of absolute levels of pitch in a phrase is also irrele- 
vant: as many as six are recorded in Shitswa. If a given mor- 
pheme with non-low tone occurs at the beginning of a phrase, 
it has the highest pitch of the phrase; near the end of a phrase 
it may be several steps down from the highest pitch, though its
toneme is the same. In the following Shitswa sentences, the 
first non-low is arbitrarily identified with "drop"; unmarked
vowels are analyzed as belonging to the domain of the preced- 
ing tone. Note the difference in pitch, but not in toneme, be- 
tween the two occurrences of the same morpheme /nEn¢g/: 


/vamiwona mifana wahosi ying ng/ 


'they see the good chief's son'


Failure to work out this admittedly unusual analysis has
prompted some remarkable statements about the languages in-
volved. It has been said that Shona has three tones, but that on-
ly two of them are "really significant"; the suspension of con-
trast at the first non-low level seems to have been the stumbling-
block here. The "drops" which I believe must be analyzed as to-
nemes have been called "syntactic conditioning" of tone, and one
preposterous explanation suggests that these languages have
"ornamental tone" rather than "lexical tone"; but numerous in-
stances of lexically contrastive pairs can be cited. Frequently
the whole acoustic effect is written off as "a descending pattern
of intonation" or "tone decay"; but each drop is phonemic -- a
phoneme inherent in some morphemes but not in others. It may
well be that sad experience with languages of just this type has
contributed to the inadequate and sometimes confusing state-
ments of some linguists whom we too readily accuse of lacking
completeness and rigidity in tonemic analysis. Abraham's anal-
ysis of Tiv comes close, but even he must resort to footnotes
to explain different levels of his "mid" tone. These languages
now seem to represent a hitherto undefined type of tonemic
structure; I have given them the label "terraced level languages".
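
The relation between tonemes and pitch in a terraced level system can be made concrete with a small sketch. The Python function below is an illustration only, not part of the original analysis: the name `terrace_levels`, the use of integers for terraces (0 for the highest), and the choice to leave low tones unnumbered are all assumptions made for the example. It follows the definitions given above: "same" stays on the terrace of the preceding non-low, "drop" steps down one terrace, and the first non-low of a phrase shows no contrast.

```python
from typing import List, Optional


def terrace_levels(tonemes: List[str]) -> List[Optional[int]]:
    """Map a toneme sequence to terrace indices (0 = highest terrace).

    Hypothetical helper: "low" is returned as None, because low pitches
    are identified on their own and do not move the point of reference.
    """
    levels: List[Optional[int]] = []
    current: Optional[int] = None  # terrace of the preceding non-low, if any
    for toneme in tonemes:
        if toneme == "low":
            levels.append(None)
        elif toneme in ("same", "drop"):
            if current is None:
                current = 0      # first non-low: "same" and "drop" do not contrast
            elif toneme == "drop":
                current += 1     # step down; this level is the new reference
            levels.append(current)
        else:
            raise ValueError(f"unknown toneme: {toneme!r}")
    return levels
```

Under these assumptions, `terrace_levels(["drop", "drop", "same"])` places the last two tones on one terrace although their tonemes differ, while `terrace_levels(["drop", "drop", "drop"])` places three instances of one toneme on three different terraces.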


All of this, of course, has to do only with descriptive typol- 
ogy, and not at all with genetic relationships between languages. 
Completely unrelated languages may display the same type of 
tonal structure: an almost incredible similarity exists between 
Fante (West Africa) and Apachean (U.S.). Conversely, closely
related languages may have different types of tonal structure: 
in a group of mutually intelligible dialects of Dyimini in the 
Ivory Coast, one dialect seems to have a terraced level sys- 
tem, while the others have discrete level systems. Of course, 
it is even more unrealistic to associate the number of tonemes 
with genetic relationship. Of two mutually intelligible dialects 
of Bariba in Dahomey, one has three tones and the other four. 
Mano and Jukun, as distantly related as is possible within a 
single language family, both have three level tonemes. 


A word should be said here about the analysis of glides in 
some non-contour languages. In an unpublished paper, Olmsted 
reported such limited glides as "high falling" and "low rising" 
in Yoruba. However, all of the patterns of Yoruba favor the 
analysis of such glides as sequences of different level tones. 
They occur only with long vowels, which are best interpreted 
as double vowels, and pattern exactly like sequences interrup- 
ted by a consonant. But Yoruba, like many other West African 
languages, does have a unit falling toneme and a unit rising to- 
neme as well, each occurring with short vowels. These are 
quite different from the sequence glides; the end-points are not 
readily identifiable with any of the level tonemes. With double 
vowels, sequences such as falling-plus-mid also occur. The 
analysis of many glides as sequences, usually associated with 
double (i.e., long) vowels, is frequently necessary in West Afri- 
can languages. The presence of double vowels, however, is not 
essential to such an analysis. Jukun has no long or double vow- 
els, and yet has glides that must be analyzed as sequences of 
two or even three level tonemes. In the Wukari dialect of Jukun,
for example, contrasts like the following occur; none of these 
four utterances takes appreciably longer to say than any other: 


/ka hwé kwi./     'He bought a chicken.'
/kG hwé*kwi./     'He bought a gourd.'
/kt hw@’kwi./     'It was a chicken he bought.'
/kG hw®kwi./      'It was a gourd he bought.'


If sequences of two or three tonemes can be crowded into 
simultaneity with a single vowel, it is equally true that, in some
languages, the domain of a toneme may be more than one "syl-
lable"; in fact, the concept of "syllables" tends to be more de-
ceptive than helpful. In Kpelle (Liberia), for example, the tones
of CVV and CVCV forms generally parallel those of CV forms, 
and are best interpreted as unit tonemes. Similar tonemic do- 
mains appear in Efik (Nigeria), and are probably very common. 
Such varied domains must, of course, be capable of phonologic 
definition, either in terms of the tone itself, or in terms of 
stress, juncture, or some other phenomenon. 


Morphotonemic alternations, of which Pike describes some 
unusually complex examples in his Tone Languages, seem to be 
a source of particular difficulty and confusion to many analysts. 
A few examples will suffice to show that they are merely typi- 
cal linguistic problems. Completely regular phonologic condi- 
tioning, for example, is found in several alternations in Bariba 
(Dahomey). In one of these, a basic sequence low-mid occurs
only in final position; it has the alternant low-high before low, 
but low-low before any other tone: 


bit         'a child'                  low-mid in final position
bit weni    'that child over there'    low-high before low
bit wi      'that child'               low-low before mid, high, or top
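
The regularity of this conditioning can be shown by writing it as a rule. The sketch below is only an illustration: the function name `low_mid_alternant` and the string labels for the four tones are assumptions, and `None` is used here to stand for final position.

```python
from typing import Optional, Tuple


def low_mid_alternant(next_tone: Optional[str]) -> Tuple[str, str]:
    """Return the alternant of basic low-mid, given the following tone.

    Hypothetical encoding: next_tone is None in final position,
    otherwise one of "low", "mid", "high", "top".
    """
    if next_tone is None:
        return ("low", "mid")    # basic form, final position only
    if next_tone == "low":
        return ("low", "high")   # before low
    if next_tone in ("mid", "high", "top"):
        return ("low", "low")    # before any other tone
    raise ValueError(f"unknown tone: {next_tone!r}")
```

Since the choice depends on nothing but the following tone, the alternation needs no reference to particular morphemes.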


A simple case of morphologic conditioning is found in 
Senari (Ivory Coast). In nouns with all low tones, the last low 
of the stem becomes mid before some suffixes which have low 
tone, but not before others: 


/tegé/      'a hoe'        /tegi/     'the hoe'
            'hoes'         /teyi/     'the hoes'
/stzelé/    'a basket'     /s??el/    'the basket'


In this case, the alternation occurs in a large number of mor- 
phemes, but is conditioned by the presence of one of a very few 
morphemes. In Loma (Liberia), Suppire (French Sudan), Ibo 
(Nigeria), and numerous other languages, a long list of mor- 
phemes becomes the conditioning factor. Stems are divided in- 
to two or more morphotonemic classes, distinguished by dif- 
ferent patterns of morphotonemic behavior in adjacent mor- 
phemes. 


A complex of conditioning factors is found in a morphotone-
mic alternation in Kpelle. High tone occasionally appears as
an alternant of mid in verbs. The conditioning is phonologic in
the sense that high tone appears only after high or mid, not af-
ter low. But it is morphologic in two ways: (1) the alternation
occurs only in some stems with mid tone; and (2) it occurs on-
ly in some verbal constructions. Only the last of the follow-
ing examples shows all three of the pertinent conditioning fact-
ors:


                                                          right   right     mid/high
                                                          stem    constr.   before

kpete                   'they have fixed one bridge'       no      yes       no
dffé gb&woi kpete ni    'they didn't fix the bridge'       no      no        yes
gbdwoi kp¢te            'they have fixed the bridge'       no      yes       yes
daa wifru t5n kula      'they have taken out one stick'    yes     yes       no
dffé kula nf            'they didn't take out the stick'   yes     no        yes
daa kila                'they have taken out the stick'    yes     yes       yes
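
The three conditions in the table combine as a simple conjunction, which may be sketched as follows. The function name `kpelle_verb_tone` and its boolean parameters are assumptions made for illustration; the six rows are the yes/no values of the table above, in order.

```python
def kpelle_verb_tone(right_stem: bool, right_construction: bool,
                     mid_or_high_before: bool) -> str:
    """Return "high" only when all three conditioning factors are present.

    Hypothetical encoding of the table: any missing factor leaves the
    stem with its basic mid tone.
    """
    if right_stem and right_construction and mid_or_high_before:
        return "high"
    return "mid"


# (right stem, right construction, mid/high before) for the six examples
rows = [
    (False, True, False),
    (False, False, True),
    (False, True, True),
    (True, True, False),
    (True, False, True),
    (True, True, True),
]
tones = [kpelle_verb_tone(*row) for row in rows]
```

Only the last row satisfies all three conditions, so only the last example takes the high alternant.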


Even in the most complex cases of morphotonemic alterna- 
tion, as those of Suppire, the conditions can be discovered and 
stated by familiar linguistic techniques. There is no need for 
a special jargon such as M. M. Green uses in calling these al-
ternations "dynamic or relational tone". Her further reference
to some such tones as "grammatical" must refer to morphemes
of tone replacement, which are normal in Niger-Congo languag- 
es. But these are, properly speaking, not morphotonemic alter- 
nations at all. A simple example is provided by Jukun: 


/ka bi/ 'he came'        bi/ 'he should come'
bi/ 'I came'             bi/ 'I should come'


Here the inherent tone of the pronouns appears in the "past"
forms. The replacement of the tone of any pronoun by high is
a morpheme signalling the "hortative" construction. Such re-
placements of the tones of pronouns, verb stems, or both --
signalling verbal constructions -- are extremely common; they 
become fairly complex in Tiv, Kpelle, Bariba, and other lan- 
guages. Similar tone replacives signal the head (second) mem- 
ber of noun compounds in Kpelle, but the attributive (first) 
member in Fante. 


Not all tonal morphemes are replacives. In the sequence 
glides of Jukun, above, low tone is an allomorph of a noun pre- 
fix which appears initially as /a/; and high tone is an allomorph 
of a morpheme indicating identification, which appears initially 
as /&/. Takum Jukun also has two noun suffixes: one has a CV 
shape with low tone, while the other consists of low tone only, 
forming the end of a sequence glide after mid or high. Finally, 
Kpelle has a morpheme whose allomorphs include a series of 
consonant replacives, but which must, in the light of the total 
structure of the language, be described as inherently "prefixed 
low tone". 


In sum, the more information we acquire about even the
most complex tonal systems, the more encouragement we re-
ceive that we already have the equipment needed to handle them.


SYLLABICS AND PHONETIC CLUSTERS IN GERMAN


Richard M. Thurber 
University of Illinois 


0. In his Manual of Phonology,1 Charles F. Hockett men-
tions three crucial problems in phonemic theory: (1) Phoneme
versus cluster; (2) multiple complementation; (3) marked and 
unmarked. The first problem, with specific reference to Ger- 
man, will be the concern of this paper. Before we propose a 
tentative solution to this problem, we will review briefly and 
comment on older principles formulated to handle it. After a 
brief description of the phonemic system of Standard High Ger- 
man as set up by Moulton2, we will present our own interpreta- 
tion, based on a statistical analysis, of syllabics and phonetic 
clusters. In conclusion we will discuss briefly possible appli- 
cations of the suggested technique to other languages. 


1. In his textbook Phonemics: A Technique for Reducing
Languages to Writing,3 Pike mentions four basic premises used
by linguists to analyze the phonemic system of a language, prem-
ises which Hockett refers to as "older heuristic principles," 
but which he by no means rejects. Of the four, the last has 
probably caused the most doubt and hesitation in phonemic an-
alysis. In Pike's words (160): "Characteristic sequences of
sounds exert structural pressure on the phonemic interpreta-
tion of suspicious segments."4 Pike readily admits that the
above premise can be difficult to apply. He asks (64) the ques-
tions: (1) "How strong, or consistent, must the pressure or
symmetry be to force a particular interpretation? (2) What
should be done when different types of structural sequences ex-
ert conflicting pressures, or when the pressures cannot be
clearly analyzed as applied to the suspicious items?" He an-
swers (64) and concludes: "We do not know, and for precisely
that reason have differing interpretations of English vocoid
glides, and the sequence [tš]. We must wait for further pho-
nemic theory to clarify these problems. Until such a theory is
available, we must expect to find differing solutions equally
valid within our present premises, even though we previously
stated that we assume that, ultimately, only one accurate anal-
ysis can be made of any one set of data."


1.1. To overcome the difficulties mentioned by Pike, 
Hockett reintroduces Bloomfield's hierarchic approach by 
working out an immediate constituent technique on the phono-
logical level. As illustration he presents (161-2) first a typi-
cal (Pikean) discussion of the status of [č] in Fox, and then his
own statement in terms of IC's. Having finished the analysis
according to Pike's suggestions, Hockett (162) concludes: "It
is hard to be sure what conclusion would be drawn from the
above considerations: it would depend in part on the prefer-
ences of the analyst. The writer's own preference, within the
limitations of the 'phoneme or cluster' approach, is to settle
for a unit phoneme /č/". Hockett suggests further that if all
the pertinent data and the possibilities have been examined,
the answer "it doesn't matter" would not be bad.


1.1.2. After using the IC approach on the same data (Fox), 
Hockett (164) concludes: "We see that the question 'is /č/ a
phoneme or a cluster?' is too narrow and not properly fruitful:
a more productive question to ask is 'on what hierarchic level,
or levels, does the given element function, and at what stage,
working down the IC scale, does it break into smaller constitu-
ents?'" He illustrates his conclusion further by his treatment
(164) of English /č/ and /ǰ/: "Perhaps we must regard /č/ and
/ǰ/ as something like close or intimate sequential clusters of
/t/ and /š/, /d/ and /ž/, in contrast to the 'normal' (though
much rarer) clusters /tš dž/. This would involve a recognition
of two kinds of sequential arrangement (intimate and 'normal'),
not contrasting in most cases."


1.2. Instead of hierarchic levels within hierarchic levels 
(with any number of phonemic levels within a broad phonemic 
level as contrasted with the morphological, syntactical, or high- 
er levels), this paper proposes a technique whereby such a prob- 
lem can be solved. To do this we assume a phonological level 
composed of one stratum. In applying the particular statistical 
technique, we follow Pike's dictum (160 Fn.) that "...our estab-
lishment of phonemic principles and procedures must ultimately
rest upon our observation of native reactions to the phonetic
data." Further, we believe that "the informants' unconscious
physical, linguistic or social reactions to the structural unity
of his phonemic system may be analyzed by the observer." At
the same time the proposed method avoids difficulties of con-
flicting criteria in phonemic analysis. For instance, vocoid
glides could be interpreted in German either as a unit or a
cluster, depending on the analyst's preference. Moulton inter-
prets them as sequences on the basis of patterning in syllable
peak-coda position. Vocoid glides could equally well be treat- 
ed as units on the basis of patterning in strong verbs, i.e., 
most tense changes in strong verbs involve the replacement of 
one unit by one unit, so that "pattern congruity" forces the in- 
terpretation of glides as one unit (in leiden, litt, gelitten, the 
glide /ai/ would be considered a unit by analogy with trinken, 
trank, getrunken). 


2. In setting up a tentative phonemic system for German, 
we use essentially the scheme presented by Moulton, who in 
turn based his interpretation, with some modifications, on the
phonetic data given by Vietor.5 The system is as follows:


Consonants:
   Stops:         p    t    k
                  b    d    g
                  v    z    ž
   Resonants:          l
                  m    n    ŋ
   Flap:               r
Vowels:           i:   y:   u:
                            u
                  e:   ø:   o:
                  >
                  A
                  a:
Diphthongs:       ai, au, ɔi


Moulton explains in detail his differences from Vietor's 
analysis.6 At the end of his article he concludes, contrary to
Bloomfield, Trubetzkoy and his own earlier article, that the
vocoid glides are two units and that long vowels are sequences
of two short vowels.7 Moulton does not discuss the suspicious
pair [ts], but merely writes it /ts/.


3. We now turn to the solution of four problems in estab- 
lishing the phonemic pattern of German: (1) The status of the 
long, tense, decentralized vowels i:, e:, u:, o:, y:, ø:, a: (for
convenience hereafter referred to as long vowels). Are they
units or sequences of a short vowel plus something else?
(2) The vocoid glides au, ai, ɔi as units or sequences; (3) The
affricates [ts] and [pf] as units or clusters; (4) The phonemic
status of -l, -m, -n, -r in unstressed syllables, i.e., should
they be interpreted as /-el, -em, -en, -er/ or must we posit 
four new phonemes of syllabicity? The last problem has ap- 
parently never been discussed in connection with German, ex- 
cept by Trubetzkoy, though Pike and Swadesh, among others, 
have discussed it in connection with English.8 


3.1. In the following analysis we adopt a different concept 
of distribution than is common in phonemic analysis by asking 
the question: Do the specific words (here free forms, minimal 
or not), in which the phonetic data in question occur, behave as 
if they belong to a group having n phonemes, or to a group hav- 
ing n + 1 phonemes per word? As a concrete example: Does 
the free form die [dij] belong to the group of words having two 
phonemes or to the group having three phonemes? If it can be 
demonstrated that die belongs to the former group, then we 
would have to interpret it as /di:/; if it belongs to the latter 
group, it would be /dii/ or /di plus something else/. 


3.2. The source used here is F.D. Kaeding's Häufigkeits-
wörterbuch der deutschen Sprache (Berlin 1897). The latter is
a word count based on nearly eleven million words of running
texts taken mostly from written sources but comprising about 
eight per cent of spoken texts (parliamentary debates). The 
different words (about 258,000) were extracted from this cor- 
pus and recorded with their frequencies. Originally compiled 
as an aid to establishing stenographic systems, Kaeding's list 
has also been used to prepare minimum vocabulary lists for 
pedagogical purposes. Kaeding defined a word as anything that 
appears between spaces on the printed page. A different word 
is anything that has a different shape from any other word in 
the traditional German orthography. Thus the free forms gehe, 
geht, geht's, gehen are four different words. Evidently Kaeding 
employed a rather primitive and for our purposes seemingly 
useless criterion. However, in the vast majority of cases 
Kaeding's words correspond to what may be called free forms,
pronounceable in isolation. Although he did not distinguish be-
tween homonyms, their role in the following statistical analy-
sis is unimportant. Furthermore the words used in the follow-
ing analysis are the 751 most frequent in all of Kaeding's texts
and most probably are the most frequent words in the spoken 
language. They are the "grammar" words: conjunctions, pro- 
nouns, particles, prepositions, and the most common nouns and 
verbs. 


3.2.1. Those words occurring a thousand times or more in 
Kaeding's list, comprising 55.9% of the eleven-million-word 
text, are used here. The 751 words were deemed sufficient to 
yield reasonably reliable statistical results. Because the phone- 
mic status of long vowels, vocoid glides, affricates, and sylla- 
bics was in doubt, all words in which any of the above phonetic 
data occurred were eliminated from the original 751 words and 
re-grouped into four classes corresponding to the four problems 
to be treated. Thus [ha:ben] or [ha:bn] was eliminated because 
it contains a long vowel and a syllabic. The remaining words
not containing the ambiguous phonetic data were used as a mas- 
ter list and divided into groups according to whether they con- 
tained 2, 3, 4, etc. phonemes (there were no one-phoneme 
words). We proceeded from what was assumed to be known to 
the unknown. The results for the master list are shown in 
Table 1 below.


Two facts in connection with the table should be pointed 
out. First, it will be noticed that the length of a word in terms 
of phonemes is decidedly a function of frequency: the length in-
creases as the frequency decreases. The requirements of an
efficient coding system are met. Second, the distribution of
each group is extremely skewed, i.e., in each group the differ-
ence between the most frequent and the least frequent words
is relatively great; thus the arithmetic mean and the median do
not coincide. In the two-phoneme group the frequencies range
from 188,078 to 4,559. Ordinarily in a skewed distribution the
median is considered the most typical representative of the dis-
tribution rather than the more familiar arithmetic mean. Nev-
ertheless we have chosen the mean as the representative of
each group so that the extreme values receive due emphasis. 
The median is totally insensitive to wide distributional varia- 
tions. Which measure of central tendency to use is a matter of 
debate. In any case, however, the results as far as the present 
investigation is concerned are the same. 
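
The grouping and the two measures of central tendency can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reproduction of the count: the records in `toy_words` are invented, and the names `toy_words` and `summarize_by_length` are hypothetical. The sketch groups words by their number of phonemes and reports the mean and the median frequency of each group.

```python
from collections import defaultdict
from statistics import mean, median

# Invented (word, phoneme count, frequency) records, for illustration only.
toy_words = [
    ("w1", 2, 188000),
    ("w2", 2, 20000),
    ("w3", 2, 5000),
    ("w4", 3, 9000),
    ("w5", 3, 3000),
    ("w6", 3, 3000),
]


def summarize_by_length(words):
    """Group frequencies by phoneme count; report size, mean, and median."""
    groups = defaultdict(list)
    for _word, n_phonemes, freq in words:
        groups[n_phonemes].append(freq)
    return {
        n: {"words": len(freqs), "mean": mean(freqs), "median": median(freqs)}
        for n, freqs in sorted(groups.items())
    }


summary = summarize_by_length(toy_words)
```

In the invented two-phoneme group a single very frequent word pulls the mean far above the median, which is the effect of skewness described above.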


Table 1: Master List


Phoneme Group    Average Frequency    No. of Words


      2               58,338                9
      3               35,825               43
      4                5,897               74
      5                3,207               54
      6                2,454               32
      7                1,857               13
      8                1,631                4
      9                2,245                2
     10                1,610                2
     11                1,887                2
     12                1,275                2

3.3. Having set up the master list, we turned to the first 
problem, the interpretation of long vowels. We made the fol- 
lowing assumption: one long vowel equals one phoneme. All 
words containing long vowels but no other ambiguous phonetic 
data were grouped in the same manner as was the master list. 
The results are shown in Table 2. The next problem is to 
match the averages of the master list with those of the long-
vowel list under two assumptions: (1) One long vowel repre-
sents one phoneme; (2) One long vowel represents a sequence
of two phonemes. Under the second assumption each phoneme
group in Table 2 would be advanced one unit,
so that there would be no group of two-phoneme words and the 
last group would contain ten-phoneme words. To match the 
long-vowel list with the master list in two ways, a nonparamet- 
ric (distribution free) technique was chosen for the following 
reasons: (1) We need not specify conditions about the paramet- 
ers of the population from which the averages were drawn, oth- 
er than that the observations are independent and that the vari- 
ables under study have underlying continuity; (2) We are deal- 
ing with data drawn from several different populations -- each 
phoneme group is a sample from a stratified parent population. 
To use the parametric t-test for the differences between means, 
a normal distribution of each population would have to be as- 
sumed and each group compared separately. The result would 
be a series of probabilities that would be difficult to interpret. 


Table 2: Long Vowel List

Phoneme Group    Average Frequency    No. of Words

      2               53,335               13
      3                6,631               65
      4                2,515               48
      5                2,586               31
      6                1,766               24
      7                1,794                8
      8                2,182                8
      9                1,263                3


3.4. The nonparametric test best suited to the data is the
Kolmogorov-Smirnov two-sample test, which tests whether
two independent samples have been drawn from the same popu-
lation (or from populations with the same distribution). For
the present analysis, the two-tailed test is chosen to decide
whether there is any kind of difference in the distributions
from which the two samples were drawn -- differences in central
tendency, in dispersion, etc. It should be noted that we are
concerned with the agreement between two sets of sample values,
not between the distribution of a set of sample values and some
specified distribution. The master list does not represent a theoretical
norm. It has been compiled from the strata of an unknown pop- 
ulation, and we are attempting to determine whether another 
series of strata from an unknown population corresponds to it 
best under one assumption or the other. 


3.4.1. The Kolmogorov-Smirnov two-sample test is con- 
cerned with the agreement between two cumulative distribu- 
tions. If our sample of averages has been drawn from the 
same population distribution, then the cumulative distributions 
of both samples should be fairly close, showing only random 
deviations from the unspecified population distribution. A cum- 
ulative frequency distribution for each sample of observations 
is made, using the same intervals for both distributions. For 
each interval one step function is subtracted from the other. 
The test is concerned with the largest of the differences. 


3.4.2. In mathematical terms, $S_{n_1}(X)$ is the observed cu-
mulative step function of one of the samples, i.e., $S_{n_1}(X) = K/n_1$,
where K is the number of values equal to or less than X. The
cumulative step function of the other sample series is $S_{n_2}(X)$,
i.e., $S_{n_2}(X) = K/n_2$. The Kolmogorov-Smirnov two-sample test
focuses on $D = \max |S_{n_1}(X) - S_{n_2}(X)|$ for the two-tailed
test. A one-tailed test would be used if we were interested in
finding out whether the population values from which one of the 
samples was drawn are stochastically larger than the popula- 
tion values from which the other values are drawn. Since we 
are here interested only in determining whether the two samples 
were drawn from different populations, we use the two-tailed 
test, i.e., we determine the maximum difference regardless of 
sign. The data on the comparison between the master list and 
the long-vowel list will serve as an example of the technique. 
For the three other problems, only the results will be given. 
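
The procedure just described can be sketched in a few lines of Python. This is a minimal sketch, not the original computation: the function names are illustrative, and the group averages are those of Tables 1 and 2 (phoneme groups 2-9) expressed in thousands, so that each sum of averages plays the role of n.

```python
from itertools import accumulate

def cumulative_shares(averages):
    """Cumulative step function S(X) = K/n over the ordered phoneme groups."""
    total = sum(averages)
    return [running / total for running in accumulate(averages)]

def ks_statistic(sample_1, sample_2):
    """Largest difference, regardless of sign, between the two step functions."""
    s1 = cumulative_shares(sample_1)
    s2 = cumulative_shares(sample_2)
    return max(abs(a - b) for a, b in zip(s1, s2))

# Group averages in thousands for phoneme groups 2-9 (Tables 1 and 2).
master = [58.338, 35.825, 5.897, 3.207, 2.454, 1.857, 1.631, 2.245]
long_vowel = [53.335, 6.631, 2.515, 2.586, 1.766, 1.794, 2.182, 1.263]

d = ks_statistic(master, long_vowel)
print("D =", round(d, 3))
```

With these inputs the largest gap falls in the first interval, the two-phoneme group, and should come out close to the D reported for the one-phoneme assumption.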


3.5. The first assumption to be tested, and which will be
our null hypothesis ($H_0$), is that long vowels represent one pho-
neme. For reasons that will be evident shortly, the frequen-
cies will be expressed in thousands, instead of the original
values. Since the long-vowel list contains no words longer than
nine phonemes under the null hypothesis, the groups in the mas-
ter list containing ten and more phonemes are omitted. The
sum of the averages in the master list is 111.4 thousands, of the
long-vowel list 72 thousands. These sums will be our $n_1$ and
$n_2$ respectively. Table 3 shows the cumulative distribution of
the original values; Table 3.1 shows the decimal equivalent of
the distribution. Given the values of $n_1$ and $n_2$, the probability
of a deviation as large as .217 occurring by chance may be
found in Smirnov's tables by first computing the value z,¹⁰ where

$$z = D \sqrt{\frac{n_1 n_2}{n_1 + n_2}}.$$

In the present example,

$$z = .217 \sqrt{\frac{111.4 \times 72.0}{111.4 + 72.0}} = 1.435.$$

For z = 1.435, Smirnov's tables give a probability be-
tween .967 and .968 that the above D did not occur by chance.
In other words, the fit is poor and suggests that the two sam-
ples come from different populations.
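
The step from D to a probability can be checked with a short sketch. It assumes that Smirnov's tables tabulate the limiting Kolmogorov distribution, $L(z) = 1 - 2\sum_{k \ge 1} (-1)^{k-1} e^{-2k^2z^2}$; that identification is an assumption of this example, and the function names are illustrative.

```python
import math

def smirnov_z(d, n1, n2):
    """z = D * sqrt(n1 * n2 / (n1 + n2))."""
    return d * math.sqrt(n1 * n2 / (n1 + n2))

def kolmogorov_cdf(z, terms=100):
    """Limiting distribution L(z), summed as an alternating series."""
    if z <= 0:
        return 0.0
    series = sum((-1) ** (k - 1) * math.exp(-2 * k * k * z * z)
                 for k in range(1, terms + 1))
    return 1 - 2 * series

# Values quoted in the text for the one-phoneme assumption.
z = smirnov_z(0.217, 111.4, 72.0)
p = kolmogorov_cdf(z)
print("z =", round(z, 3), " L(z) =", round(p, 4))
```

Under that assumption the series gives a value inside the interval quoted from the tables, so the tabled figure and the formula agree for this case.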


Table 3: Cumulative Sum of Averages of Each Phoneme Group
of Master and Long-Vowel Lists

Phoneme Group    2      3      4      5      6      7      8      9

S_111.4(X)      58.3   94...
S_72.0(X)       53.3   59.9   62.2   65.0   66.8   68.6   70.8   72.0


Table 3.1. Decimal Equivalents of Data in Table 3

Phoneme Group              2      3      4      5      6      7      8      9

S_72.0(X)                 .740   .832   .867   .903   .928   .953   .983   1.000

S_111.4(X) - S_72.0(X)    .217   .012   .030   .023   .020   .012   .004   .000


$D = \max |S_{n_1}(X) - S_{n_2}(X)|$


3.6. We now turn to the second hypothesis ($H_1$): that long
vowels represent two phonemes. Because under this hypothe-
sis there would be no two-phoneme group in the long-vowel
list, the two-phoneme group average is dropped from the mas-
ter list and the ten-phoneme group is added. The cumulative
distribution in decimal form is shown in Table 3.2 below. It
is clear that $H_1$, namely that long vowels consist of two pho-
nemes, yields a much better fit. Thus by our analysis, based
on a different concept of distribution, we arrive at the same
conclusion as did Moulton.


Table 3.2. Cumulative Distribution of Phoneme Group Aver-
ages under $H_1$

S_72.0(X)                .740   .832   .867   .903   .928   .953   .983   1.000

S_54.7(X) - S_72.0(X)    .085   .069   .046   .036   .026   .022   .012   .000

D = .085, z = .474, P = .02
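
The realignment of the two lists under the second hypothesis can be sketched as follows. This is an illustrative reconstruction, assuming the group averages of Tables 1 and 2 in thousands; the `shift` parameter and the function name are introduced here and are not the author's.

```python
import math
from itertools import accumulate

# Group averages in thousands, keyed by phoneme count (Tables 1 and 2).
master = {2: 58.338, 3: 35.825, 4: 5.897, 5: 3.207, 6: 2.454,
          7: 1.857, 8: 1.631, 9: 2.245, 10: 1.610}
long_vowel = {2: 53.335, 3: 6.631, 4: 2.515, 5: 2.586,
              6: 1.766, 7: 1.794, 8: 2.182, 9: 1.263}

def d_and_z(shift):
    """Compare long-vowel group g with master group g + shift.

    shift = 0: a long vowel counts as one phoneme.
    shift = 1: a long vowel counts as two, so each group advances one unit.
    """
    groups = sorted(long_vowel)
    a = [master[g + shift] for g in groups]
    b = [long_vowel[g] for g in groups]
    n1, n2 = sum(a), sum(b)
    s1 = [x / n1 for x in accumulate(a)]
    s2 = [x / n2 for x in accumulate(b)]
    d = max(abs(p - q) for p, q in zip(s1, s2))
    return d, d * math.sqrt(n1 * n2 / (n1 + n2))

d0, z0 = d_and_z(0)
d1, z1 = d_and_z(1)
print("one phoneme:  D = %.3f, z = %.3f" % (d0, z0))
print("two phonemes: D = %.3f, z = %.3f" % (d1, z1))
```

Because the sketch works from unrounded sums, its z values differ slightly in the third decimal from those in the text, but the ordering of the two fits is the same.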


3.7. The analyses of the vocoid glides, the affricates, and the
syllabics are summarized briefly below.


3.7.1. Vocoid glides as one phoneme: z = .635, P = .35.
Vocoid glides as two phonemes: z = .162, P = .00. Conclusion:
vocoid glides are best treated as sequences of two phonemes.
Again we agree with Moulton's conclusion.


3.7.2. Affricate [ts] as one phoneme: z = .04, P = .00.
[ts] as two phonemes: z = 1.24, P = .91. Conclusion: evi-
dence strongly favors unit interpretation. For [pf] the 751
words from Kaeding's compilation yielded only three examples.
The D for the unit interpretation is .276; for the interpretation
as two phonemes, D is .116. Though the evidence is meager, it
points to a two-phoneme interpretation.


3.7.3. Syllabics as one phoneme: z = 1.27, P = .76; syllab-
ics as two phonemes: z = .23, P = .00. Conclusion: syllabics
should be counted as two phonemes /-el -en -em -er/.


3.8. A few points should be noted concerning the mathe-
matics: (1) If the original figures had been used instead of the
reduction to thousands (111,400 instead of 111.4, etc.), the prob-
abilities of any comparison yielding a good fit would have been
sharply reduced, i.e., the D's would have been enormously sig-
nificant in all instances. In this study we have been interested
in finding which of two alternatives yields the better fit. If we
had used the original figures, it would have been awkward math-
ematically to evaluate the differences between any pair of D's.
(2) The number of words in each phoneme group in each list
was deemed adequate to yield a reliable enough average upon
which to set up the various distributions. Though in the mas-
ter list, for instance, some group averages are based on only
two items (the phoneme groups 8-12 show averages that are
doubtless too high since only the "tops" of these groups are in-
cluded in the first 751 most frequent words), the general dis-
tribution of the averages would not be changed significantly by
using a larger list. Note also that Group 2 of the master list
contains only nine items, but that these nine items include well
over half of all the two-phoneme words in the 258,000 words in
Kaeding's list. (3) An alternative solution would have been to
run a correlation analysis for each problem, using the aver-
ages of each group as the paired variates. This was done and
the results are exactly the same. It is interesting to note that
r varies from .880 for the worst fit to .999 for the best. The
correlation coefficient is typically about .92 for the bad fit and
about .98 for the good fit. The good fit then is in all cases bet-
ter than the bad fit. However, the Kolmogorov-Smirnov test
involves no assumptions of linearity of regression and is eas-
ier to compute.
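
The correlation alternative mentioned under (3) can be sketched for the long-vowel problem. The text does not say which coefficient was computed; the sketch assumes the ordinary product-moment (Pearson) r over the paired group averages of Tables 1 and 2, and its names are illustrative.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Group averages in thousands: master groups 2-10, long-vowel groups 2-9.
master = [58.338, 35.825, 5.897, 3.207, 2.454, 1.857, 1.631, 2.245, 1.610]
long_vowel = [53.335, 6.631, 2.515, 2.586, 1.766, 1.794, 2.182, 1.263]

# One phoneme: pair long-vowel group g with master group g.
r_one = pearson_r(master[:8], long_vowel)

# Two phonemes: pair long-vowel group g with master group g + 1.
r_two = pearson_r(master[1:9], long_vowel)

print("r, one phoneme: ", round(r_one, 3))
print("r, two phonemes:", round(r_two, 3))
```

On these figures the two-phoneme pairing gives the higher coefficient, in line with the ranking obtained from the Kolmogorov-Smirnov comparison.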


4. Our analysis results in the following inventory of linear
phonemes for German: Consonants: p, t, k, b, d, g, f, s, š, x,
h, v, z, ž, c, l, m, n, ŋ, r. Vowels: a, e, i, o, u, y, ø. The in-
ventory corresponds to Moulton's analysis except for /c/ and
his /ə/. It is not the purpose here to debate the pros and cons
of one interpretation over another of the same evidence. For 
instance we can accept Moulton's analysis of long vowels as 
sequences of two short vowels rather than as a short vowel 
plus something else, although he states that other interpreta-
tions are possible. A statistical analysis, at least the one pre-
sented above, can only express how many, not what units are
involved. If we must assume that an utterance in any language 
is made up of a whole number of units only, then it may be that 
a statistical analysis is an adequate method for determining the 
size of any linguistic unit. It follows also that if the above 
statistical procedure is valid, then any language may be subject- 
ed to the same type of analysis. Thirty or forty hours of spoken 
text (roughly one hundred thousand words) should be more than 
adequate. Even a tenth of this size would yield hints as to the 
best interpretation.


4.1. One question might be raised by the critical reader: 
What if the analysis for any given language shows that no sta- 
tistical difference can be demonstrated for one hypothesis or 
the other? If the definition of distribution given in this paper 
is correct, then the answer would be: "It doesn't make any dif- 
ference." The decision would rest with the preference of the 
analyst, precisely as it often does in more traditional defini- 
tions of distribution. It would perhaps be interesting to exam- 
ine data from a number of languages in which similar prob- 
lems occur to see whether ambiguous results are achieved. 

The writer doubts whether this would happen. 


PHONOLOGICAL ASPECTS OF WELSH-ENGLISH
BILINGUALISM 


Robert A. Fowkes 
New York University 


In the field of Celtic things often move slowly. And, while 
caution has not always been a conspicuous feature in that rug-
ged terrain, innovation is likely to be greeted with more than a
modicum of suspicion. Thus it is that phonemics has created 
no tremendous stir in the Celtic countries themselves, with the 
partial exception of Brittany, where proximity to France has 
had some effect. Most studies that apply phonemic method to 
the Celtic languages have originated outside those countries. 


The title of the present paper¹ is intentionally ambiguous,
for the word phonological permits the discussion to concern it- 
self with both phonemic and non-phonemic aspects of the sound 
system. To restate in phonemic terms what is usually said or 
implied (in phonetic terms) in the grammars and handbooks, the 
Welsh language apparently has the following phonemes: 


Ipbtdkgmnytl 


There are three degrees of stress and three degrees of vowel
length. [ɔ] and [ɛ] also occur but are never long and seem to be
allophones of /o/ and /e/, although there may be some debate
on that score, depending on regional considerations. (The pres-
ent paper is based on what may be called standard Welsh.)

Shwa occurs long only once, as the name of the last letter of the
alphabet (orthographically: y), which is by nature a special case.
The phonetic range of the various phonetic "realizations" of
this phoneme is extremely wide.


There will be immediate objection to this phonemic picture 
of the Welsh sound system. Somebody who has heard Welsh 
spoken or who speaks it himself will protest the omission of 
several items; he will assert the existence of additional "sounds" 
and he will, of course, be right. Many of these will be allophon- 
ic in character, but others will prove to be something more. For 
this initial representation is purposely based on that portion of 
the Welsh vocabulary that may in traditional terms be called 
inherited, plus the loanwords of the earlier strata (Latin, Old 
English, Middle English). In other words, the products of sub- 
sequent English lexical invasion are, for the time being, ig- 
nored, although the results of that invasion have been far-reach- 
ing and must ultimately be taken into account. 


The Welsh phonemic system, as given here, lacks /č ǰ ž z/.
There are, of course, innumerable other theoretically conceiv-
able elements that are not present in Welsh, but the absence of
certain ones is more pertinent, for various reasons, than that
of others. Furthermore, other sounds (whether phonemes or
allophones is, at the moment, irrelevant) do not occur in cer-
tain positions; [l, v, w, θ], for example, do not occur initially
except under certain conditions of sandhi. In traditional Celtic
terminology, these are not found as "radical" initials. Despite
this state of affairs, bilingual Welshmen, when speaking Eng-
lish, encounter no difficulty in uttering [č, ǰ, ž, z] in any posi-
tion and are equally untroubled by [l, v, w, θ] in initial posi-
tion. Such "violation" of Welsh phonemic patterning is obvious-
ly irrelevant in English, even in the English of Welshmen,
since, as is generally recognized, a given phoneme exists in a
given language, and an individual speech act occurs in a specif-
ic language.² It would be very difficult to sustain the thesis
that the phonemic inventory of bilingual speakers consists of
the sum of two phonemic systems.³ Rather, a bilingual must
have two separate inventories, with a certain measure of pho-
netic resemblance. This resemblance is not wholly fortuitous,
for the phonic stock of one inventory may at times spill over
into that of another.


Now it happens that the most frequent family name in Wales, 
Jones, has, as its initial phoneme, a sound theoretically absent 
from the language. Other family names contain /č/, /2%/, etc.,
despite the fact that native Welsh words do not have such pho-
nemes. Most Welsh family names are of English or Norman or-
igin, since the old Celtic method of naming was based on a
patronymic system of a different type and did not normally re- 
sult in the equivalent of family names. Welsh has managed to 
incorporate these English and Norman names without difficulty 
into its own word stock. It cannot plausibly be assumed that a 
Welshman is lapsing into English every time he says John 
Jones or Richard Richards in the course of a Welsh conversa- 
tion. In fact, a Welshman hearing a name like Richard Rich- 
ards will often respond in some such fashion as "Dear me, 
that is a Welsh name," and the phonic sequence Richards will 
constitute a portion of his utterance felt to be especially 
Welsh. 


And even those native speakers of some sophistication who 
cogitate on such matters unwittingly give significant clues as to 
what is relevant or not in the sound system of these loans. A 
few years ago the writer heard, in an official adjudication at an 
eisteddfod in Wales, a sharp condemnation of the way the o was 
pronounced in the name Jones: "There is no [ou] in Welsh; we
must pronounce it as a 'pure' vowel" was the substance of the
rebuke. But the adjudicator was unperturbed at the fact that
there is not only no [ou] in Welsh but also no j, no z, no clus-
ter nz (or even ns) in final position, etc. In fact it is, strictly
speaking, not even possible to spell the name Jones in Welsh--
assuming that such a situation might arise--because there is
no name for the letter j. It clearly must be admitted, then,
that the Welsh system has acquired a new phoneme /ǰ/. And,
while words like jêl 'jail', jam, jar, jwg 'jug' (with a plural
jygiau as evidence of complete morphological adaptation to the
Welsh system) may conceivably (not inevitably) be recognized
as loans from English, Welsh family names are certainly not
so perceived by native speakers. It may also be pointed out
that no Welshman seems to worry about writing or printing
Jones with a j, and a Welsh dictionary that presents the alpha-
bet at the beginning, sans J, can blithely have a few columns of
Welsh words beginning with that character. Hence there is
graphic accommodation to match the phonemic acquisition. 


There was a period when the acceptance of the phoneme
/ǰ/ was not immediately accorded, and a substitute sound /š/
or /sy/ was employed instead. The retention at the present
time of forms with these substitute sounds alongside of the
forms later introduced with /ǰ/ results in doublets. Hence we
find Sion, Sian, Siam, Sieffre, Sior, Siarlys beside John, Jane,
James, Jeffrey, George, Charles, with a preference for the
former group in first names. Interestingly enough, the substi-
tute sound /š/ was itself not a phoneme at the time of the in-
troduction of these loans; it seems to have been at the most a
positional variant of s before y, but it quickly assumed pho-
nemic status, which it now retains without doubt. The ques-
tion arises whether ǰ was an allophone to begin with also. This
is somewhat more difficult to answer. It is apparently not pos-
sible to date precisely the origin of ǰ for dy in words like jawl
'devil' (a form with a special flavor and connotation), beside
diawl, diafol (from Lat. diabolus). Very likely this allophone
came into being after the arrival of English loanwords with
/ǰ/. At any rate, its phonemic status is certainly the result of
those English loans. /č/, on the other hand, seems to be an in-
stance of a phoneme imported wholesale, for there is no Welsh
phoneme or sequence of which č is a variant. In names like
Richards, Charles, etc., its phonemic status is clear. Rhisiart
is a doublet showing the same substitution as was seen in the
case of ǰ above, and common nouns often show another substi-
tution, namely ts, cf. [starts] 'starch', [matsis] 'matches'
(with a "singulative" [matsen]), [wits] 'witch', etc. Once more,
the ersatz sequence was not one originally occurring in Welsh.


Names like Edwards and Richards have [dz], which is no
doubt to be designated as a cluster or sequence, rather than a
phoneme. At least there is no readily available proof of pho-
nemicity. A new cluster has thus been introduced into the
language via such names. One member of the cluster was not 
originally present as a phoneme itself, namely the /z/. This 
entered Welsh possibly in the first instance through biblical 
names, which abound in that phoneme. Therefore, even though 
there is supposedly no z in the Welsh alphabet, names like the 
following have become naturalized in the language of the chapel 
and Sunday school, institutions which have exerted an influence 
on Welsh cultural life that can hardly be exaggerated: Ezra, 
Ezeciel, Zecharian, Zebedeus, Zorobabel, Mizraim, Gazah, etc. 
Zion has the competing forms Seion, Sion, and some of the oth- 
ers have doublets with s for z. The word zel 'zeal', which is 
not of the same biblical origin although certainly closely con- 
nected with the same group of words, has the doublet sel. The 
Bible has, incidentally, also been the source of numerous other 
names with the phoneme /ǰ/, discussed above; cf. Jacob, Job
(beside Siob, lob, lo), Jeremiah, Jonah, Joshua, Jehofah, Jona-
than, Japheth, Joctan, etc.


The phoneme /ž/ has been introduced on a more restricted
scale by far. It nevertheless occurs in such words as [inžan] /
[inžin] (again with a competing form [inǰan]), and in the speech
of some people it can be heard in words like [televižon] (de-
spite the attempts to propagate a new coinage darlledu); others
use [š] in such words, which may indicate that the phonemic
status of ž is somewhat dubious in the words in -sion (borrow-
ings from English, obviously).


In view of the preceding remarks it is plainly necessary to 
revise the phonemic inventory as first--and tentatively--given 
and to posit now the following phonemes for Modern Standard 
Welsh: 
Double underlining indicates the added phonemes. It was not 
clear at first whether to regard ts as a single phoneme /c/ (e.g. 
in Roberts) or to label it a sequence; certain parallelism in dis-
tribution to ds, dz, however, makes it likely that there are two
phonemes involved. 


To some extent these additions to the phonemic stock of
Welsh constitute moves in the direction of symmetry. Since the
opposition of voiced and voiceless plays a vital role in Welsh
phonology (cf. the pairs p:b, t:d, k:g, f:v, θ:ð) the addition of ž
and z filled in two gaps in the pattern by providing the voiced
counterparts of š and s. And the acquisition of č, ǰ further re-
inforces this type of opposition (whether the two were added
simultaneously or at different periods). Significantly enough,
however, these relatively new phonemes have not yet made their
way into the native word stock in any conspicuous measure; on-
ly such instances as jawl for diawl as noted above can be ad-
duced, and these are only sporadic. But it may be expected, and
even predicted, that native words will become increasingly vul-
nerable to assault, at first on the allophonic level, and may ul-
timately acquire some of these phonemes themselves. The
chronological layer to which a given vocable historically be-
longs is, after all, usually a matter of indifference to the
speaker.


On the non-phonemic level, once more, there has been the
introduction of words with the initial sounds [l-], [v-], [w-],
[θ-]. These sounds cannot occur in a so-called non-mutated
or radical form of a Welsh word. They occur only under cer-
tain morphophonemic conditions as modifications of, or re-
placements of other sounds. They are not now allophones of
the sounds they replace, although this was once almost cer-
tainly the case. As the result of the acquisition of certain loans
from English, these sounds are now "permitted" initials. Inter-
estingly enough, they do not usually participate in the process
of consonantal mutation but constitute virtually unalterable in-
itials. Examples of this type of loan are:

with [l-]: lamp, lili 'lily', leicio 'like', lein 'line', lemon,
lwc 'luck', lyfli 'lovely' (one of the most fre-
quent adjectives in spoken Welsh!)

with [v-]: ficer [viker] 'vicar', fôt [vo:t] 'vote', Vaughan,
Fyrsil 'Vergil', etc. (f in Welsh orthography =
[v])

with [w-]: Williams, witsio 'bewitch', wercws 'workhouse,
poorhouse'

with [θ-]: theater, thesis, theistiaeth 'theism', Thybawd
'Theobald'.


A somewhat enigmatic example of initial [l-] is that found
in the name Lloyd. This is one of the comparatively few family
names of Welsh native origin, and yet it is made to behave pho-
netically as if it were derived from English. To begin with it
was an adjective llwyd [ɬuid] 'gray', and this is still the Mod-
ern Welsh word for 'gray'. When occurring as an adjective it
has the expected phonetic value of its initial, namely a voice-
less lateral fricative, becoming [l-] only under the pertinent
conditions of sandhi. It almost seems as if, along with an orth-
ographic change (from Lhuyd, Llwyd, etc. to Lloyd), which gave
a more English aspect to the form of the name, a thorough An-
glicization took place, including repression of the initial pho-
neme or replacement of /ɬ/ by /l/, despite the contradictory
retention of the Welsh orthographic feature ll-. It may also
be true that speakers are not usually aware of the meaning of
the name. (Such a famous name in Wales as Lloyd-George vio-
lates the inherited phonemic system in both its members!) A
possible explanation is that there was once a period in which
speakers realized that family names in Wales were, in the
main, loanwords, especially when preceded by the obviously
foreign titles Mr. and Mrs. At such a time it would not have
been inconsistent to treat Lloyd as an English name and to pro-
nounce its initial accordingly. Today, as has been stated, Welsh
names are not regarded as English by native speakers except
by those pedantically aware of their source. But the mystery
of the form Lloyd increases when it is recalled that no such
process occurred in the name Llewelyn, which retains its in-
itial ɬ. Or is the appearance of this name such as never to be
confused with anything English?


The word ficer [viker] caused some trouble for a while af-
ter its introduction into Welsh, since the initial v-sound was
sometimes interpreted as being a lenited consonant. There was
an analogical attempt to restore its assumed "radical" initial.⁴
But the source of v- as a lenited sound is ambiguous, since both
m- and b- become v-. Hence there were formerly competing
forms micar and bicer, before the one with [v-] won out. In
some other instances it was the analogical form which swept
the field; the Welsh word for 'verb', for example, is [berv], de-
spite its learned origin (or possibly because of that origin, if
we want to assume learned tampering, a familiar enough proc-
ess). The family name Vaughan shows, once more, the reten-
tion of a form with lenited initial, being generalized from a con-
text in which the form with v- would have been justified. It is
the adjective bychan [bəxan] 'little' and could regularly occur
with initial [v-] in some such context as "John the Small" or
the like.⁵ It does, though, add one more vocable to the number
of those now occurring with what was once an "impossible" in-
itial in the language.
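
The ambiguity behind the competing forms can be made concrete with a small sketch. It assumes a deliberately simplified table of Welsh soft mutation, written phonetically; the table, the function names, and the rough transcriptions are illustrative and are not drawn from the paper.

```python
# Simplified soft-mutation (lenition) table: radical -> lenited initial.
# Illustrative only; the deletion of g and other details are omitted.
SOFT_MUTATION = {
    "p": "b", "t": "d", "k": "g",
    "b": "v", "d": "ð", "m": "v",
    "ɬ": "l",
}

def possible_radicals(initial):
    """Radical consonants that could lie behind a given lenited initial."""
    return sorted(r for r, lenited in SOFT_MUTATION.items() if lenited == initial)

def restore_radical(word):
    """Candidate 'radical' forms for a word taken to begin with a lenited sound."""
    return [radical + word[1:] for radical in possible_radicals(word[0])]

# A loan with initial [v-] has two possible sources, b- and m-.
print(possible_radicals("v"))
print(restore_radical("viker"))

# Initial [l-] has only one possible source, so no such competition arises.
print(possible_radicals("l"))
```

Because the mapping from radical to lenited initial is many-to-one for [v-], undoing it yields two candidates, which is the situation the competing restored forms reflect.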


Up to now we have considered cases in which the influence
of English caused alterations in the phonemic system or at sub-
phonemic level in Welsh. But bilingualism may not only be
the cause of change; it may conceivably also have a preserva-
tive function and promote the retention of features accidentally
shared by the two languages. It seems possible, although no
specific proof can be offered, that the preservation of the voiced
and voiceless dental fricatives in Welsh [θ, ð] may be due to
their simultaneous continued existence in English. This con-
trasts with the situation in Breton, e.g., where the sounds once
occurred but have now both become [z]. Did the absence of
such sounds in French have anything to do with their inability
to maintain themselves in Breton? And was the specific
course of their development determined to any degree by
French?


Moreover, Welsh initial voiceless stops [p, t, k] are char- 
acterized by strong aspiration, as are the corresponding sounds 
in English. (The articulatory position of t- is, admittedly, dif-
ferent from that of English t-.) In Breton this aspiration
seems very weak or absent. Here again comparison with 
French is obvious. Whether this is entirely a matter of bilin- 
gualism or is a manifestation of "areal" linguistic processes 
is hard to say. Perhaps the basis of many areal linguistic 
phenomena is to be sought in bilingualism to begin with, rather 
than in some mystical influence of proximity. The spread of 
features or their retention as a result of contact of languages 
can, after all, occur only in languages as spoken by people. 


This paper has concentrated on the phonological effects of 
English upon Welsh. Another study could be made of those op- 
erating in the opposite direction. The most conspicuous effect 
might be found in intonation. But phonemes foreign to English 
are introduced from Welsh, especially in place names and per- 
sonal names. Llangollen and Llewelyn will ordinarily be pro- 
nounced with Welsh phonemes by bilinguals speaking English 
and even many monolingual speakers have acquired Welsh pho- 
nemes, just as they may use Welsh intonational patterns in 
their English without knowing how to speak Welsh. Very rare- 
ly are Welsh speakers of English troubled by English conso- 
nant clusters and other phonic sequences, despite the absence 
of these in Welsh. But there are cases in which this difficulty 
does arise, nevertheless. A conspicuous case is that of the
sequence semi-vowel plus homorganic vowel (yi, wu), which can-
not occur in Welsh--or does not. Thus many a bilingual will
have trouble with English utterances like the year, wood, wom-
an, a state of affairs which often baffles English teachers in
Wales who may know some Welsh and who realize that the in-
dividual sounds of these words all occur in Welsh but fail to
grasp the systemic difficulty inherent in the specific combina-
tions. Here is a case where arrangement of familiar phonemes
causes greater trouble than the mastery of entirely new individ-
ual phonemes.


Other phases of Welsh phonological influence on the Eng- 
lish spoken in Wales are obscured by historical problems. 
Sometimes a Welsh origin may be attributed to a phenomenon 
that is actually a preservation of an earlier stage of English 
(non-diphthongal pronunciation of ō, ē, etc.). Others require a
careful analysis of regional varieties of Welsh before defini- 
tive comment can be made. One thing seems clear, however: 
regardless of how a phonic or lexical addition to a language oc- 
curs, it cannot be considered as an isolated phenomenon of 
"borrowing" but must be recognized as exerting an influence 
upon the system, the structure of the language concerned. 


1. This paper was originally presented, in somewhat different 
form, at the Annual Conference of the Linguistic Circle of 
New York, on May 10, 1958. 

2. Cf. Uriel Weinreich, Languages in Contact, New York, 1953, 
p.7, also J. Lotz, "Speech and Language," Journal of the Ac- 
oustical Society of America 22(1950)712. 

3. This has been maintained, however, by reputable linguists, 
although the opposite assertion--that systems co-exist, rath- 
er than merging--seems more tenable. See the discussion 
in Languages in Contact, 8-9. 

4. See R. A. Fowkes, "Initial Mutation of Loanwords in Welsh," 
Word 5(1949)205-13, esp. pp. 208, 210. 

5. See C. W. Bardsley, A Dictionary of English and Welsh Sur-
names, London, 1901, p. 780 s.v. Vaughan.

6. What occurred was something more than an adjustment in
the articulatory and acoustic habits of Welshmen, with a
readiness to accept certain sounds in environments where
they were not previously encountered. There was also in-
volved a transformation in the way speaker and hearer re-
acted to those sounds with respect to implication and inter-
pretation. They had now to refrain from reacting in what
would previously have been virtually automatic fashion (re-
garding forms like lamp, theater, etc. as implying vocative
or objective relationship, negation, and the like--syntactic
situations which "mutation" could denote). Thus even one lex-
ical item like lamp could precipitate processes extending far
beyond the phonic.


A NOTE ON PRESENT DIALECT STUDIES IN IRAN


Earnest R. Only 
Tehran, Iran 


The contemporary study of Persian dialects by native
scholars is well illustrated by six publications which have ap-
peared in Tehran in the last six years.¹ Because of the basic
similarities in approach and treatment, these publications are
taken as representative of the point of view from which Iranian
linguists are tackling the problem of recording their country's
dialects.²


Iranian scholars should be uniquely qualified, as compared 
to western students, for this task. They start with at least two 
important assets, a thorough knowledge of the standard lan- 
guage, and perhaps a dialect, and access to the dialect speakers. 
Shortcomings of considerable magnitude appear to be the lack 
of an adequate phonetic, not to mention phonemic, description 
and a complete unfamiliarity with modern methods of linguistic 
analysis. 


The interest of the Iranian scholars has been almost completely
lexicographical: the compilation of glossaries and dictionaries.
Frequently this effort is directed at recording vocabulary
items which appear to be markedly different from Standard
Written Persian (SWP). At this point it may be noted that all
dialect equations are with SWP and not with any form of spoken
Persian. Connected material is seldom recorded, and morphology,
if mentioned at all, is confined to a brief and inadequate
grammatical sketch. As we shall see shortly, the inadequacies of
the transcriptions make it impossible to detect many phonetic
and phonemic distinctions which may be present.


The material recorded, however, is not without interest.
Behd., for example, records the Gabr dialect used by the Iranian
Zoroastrians and includes terminology for Zoroastrian ritual
objects, personal names, shrines, public places and marriage
ceremonies. Gil. is thorough in its coverage of birds,
particularly water fowl, and nautical terminology. Three line
drawings are of particular interest: a sketch of a stilted house
labeled with the terminology of the various parts, a drawing of
a Caspian boat with all the parts named, and a sketch of the
winds of Gilan indicating the compass points from which they
blow and the Gilaki names for the winds. Keri. deals with the
interesting Tat dialect.


There is, of course, no information on stress, juncture or
intonation. Lar. and Keri. give grammatical sketches. Lar.
also includes two and a half pages of text in the Arabic alphabet,
accompanied by a Latin transcription and a Farsi translation.
Behd. has some specimens of Zoroastrian verse, likewise
in the Arabic alphabet with Latin transcription and Farsi translation.
Keri. includes an etymological vocabulary, a list of fifty
dialect words accompanied by the cognates or supposed cognates
in other Iranian and Indo-European languages.


Commonly, no attempt is made to limit a study to a single 
community. Gil. records forms from 21 villages, Lar. from 
six, Kerm. and Behd. from two each. Only Asht. and Keri. 
deal with a single community. Although the place of occur- 
rence of any given vocabulary item is consistently recorded, 
only rarely does the entry indicate whether or not that particu- 
lar form occurs in other villages included in the study or wheth- 
er a different form is used. 


A major drawback, but one not confined to these particular 
studies, is the lack of a reverse index.3 Only Asht. has an in- 
dex of SWP forms accompanied by a page number referring to 
the dialect word. Keri. is a Persian-Kerigani vocabulary with- 
out a dialect index. 


The major deficiency in the methods used by the Iranian
scholars is the lack of a meaningful transcription. The all-too-familiar
confusion between writing and speech appears to
be basic to all the work. The Arabic alphabet, with its rules of
fit, which was borrowed to write Persian has in turn been transferred
to the dialects. Christensen's remarks on "une transcription
dont l'exactitude est limitée par l'insuffisance des
moyens d'expressions de l'alphabet arabe" still hold good.4


In partial recognition of the inadequacies of this method,
each dialect word is customarily accompanied by a transcription
in the Latin alphabet. The results are somewhat less than
satisfactory. There is no reason, of course, why a phonetic or 
phonemic alphabet based on the Arabic could not be devised. 
This has not been done, however, nor has the Latin alphabet 
been used adequately. 


Frequently a graphemic distinction is maintained among
such shapes as zal, ze, zad, and za or se, sin, sad. These distinctions
cannot be considered as allographic, however, as their
occurrence cannot be defined from within the writing system
but only by reference to the SWP equivalent. The Latin transcription
uses a simple one-to-one correspondence between
the Latin and the Arabic alphabets without any further attempt
at phonetic description. Thus pe is equated with <p>, te and ta
with <t>, be with <b>, etc. The vowel system records only the
"long" vowels by means of 'ælef, vav, or yod for <a>, <u>, and
<i> respectively. Usually in an initial position 'ælef is described
as written for <a> and 'ælef mæddi for (47 or <aa>.
In actual practice 'ælef initial is used for <a> or <o>. Occasionally
an effort is made to represent a sound which is unfamiliar
to Persian ears. Gil. writes ye vav and equates it with (up,
Lar. writes <ü> with no Arabic alphabet equivalent and describes
it as the French <u>. Lar. also writes <ö>, again no equivalent
in the Arabic alphabet, and describes it as French <eu>.


Only one investigator, Kia in Asht., omits the Latin transcription
and attempts a consistent, though inadequate, representation
using the Arabic alphabet. He avoids most allographic
variations and has devised symbols for (approximately) [a] [e] [o]
[u] [i] [y] [w]. However, again no phonetic descriptions are provided
except for equations with SWP or, in the case of [w], with
English.


The state of affairs just described results, for example, in
Gil. using seven vowel signs in its Latin transcription, compared
to the sixteen vowels used by Christensen in his phonetic transcription
of the Resht dialect,5 or the nine vowel "phonem" reported
by V.I. Zavyalova6 and T. N. Pahalina.7


This brief note has been intended to call attention to the
work that is being done in Iran on Persian dialects as well as
to indicate some of the obvious shortcomings which will be 
found when trying to use this work. Unfortunately the Iranian 
scholars who are engaged in studying their dialects seem to be 
out of touch with linguistic developments in both Europe and 
the United States. Much labor has obviously been put into the 
task. The pity is that the methods used should produce such 
unsatisfactory results. 


Although dialects promise to persist longer in Iran than in 
some other areas, the spread of education, radio-listening and 
the press threaten the same adulteration and eventual extinc- 
tion as elsewhere. 


B. Dorn's old work "Beiträge zur Kenntniss der iranischen
Sprachen, I. Theil, Masanderanische Sprache," St. Petersburg,
1860, has just been republished in facsimile in Tehran.
I have also been told that a substantial portion of the Koran in 
Mazanderani has been found and will be published. 


1. The following are the works on which this note is based: 
Lar. - Ahmad Eqtesadi, Farhange Larestani, Farhange Iran
Zamin, no. 1, Tehran, 1334.

Asht. - Sadeq Kia, Guyeshe Ashtiani, Tehran University
Publications, no. 384, Tehran, 1335.

Kerm. - Manuchehr Sotudeh, Farhange Kermani, Farhange
Iran Zamin, no. 4, Tehran, 1335.

Behd. - Jamshid Sorush Sorushian, Farhange Behdinan,
Farhange Iran Zamin, no. 3, Tehran, 1335.

Keri. - Yahya Zoka, Guyeshe Keringan, Towards Learning,
no. 2, Tehran, 1332.

Concerning the terminology used in this note: When 
speaking of the symbols of Standard Written Persian (SWP), 
I name the letters according to the system used in the Per- 
sian Reader, Foreign Service Institute, Washington, D. C., 
1957. The standard square brackets to enclose phonetic 
symbols and pointed brackets for writing symbols are also 
used. 

2. I use "Persian" here in the geographic sense of languages
spoken within the boundaries of Iran rather than in the strict
linguistic sense of Persian as an Iranian language contrasted
with Luri, Kurdish, Gilaki, Mazanderani etc.

3. William Bright's remarks on the necessity of a reverse index
in works of this kind are appropriate here as well as to
American Indian languages; see his review of Longacre's
Proto-Mixtecan, Lang. 34 (1958), 166.

4. Arthur Christensen, Contributions à la dialectologie iranienne,
I, Copenhagen, 1930, 10.

5. Ibid., 38.

6. V.I. Zavyalova, "Novye Svedeniya po phonetik iraniskih
Yazykov," Trudy Instituta Yazykoznaniya, VI, 1957, 94.

7. T.N. Pahalina, Sovremenniya Iran, Moscow, 1957, 75.

THE IDENTITY THEOREM


D. L. Olmsted 
University of California, Davis 


Recent developments1 in the theory of morphemic and
syntactic analysis have made important use of classes of pre- 
viously segmented items (e.g. morphs, morphemes). In some 
cases these classes have been defined in terms of environ- 
ments; items having identical or similar distributions are as- 
signed to the same class for purposes of further analysis. 


One difficulty inherent in such formulations resides in the 
terms 'identical' and 'similar'. Identifying an 'identical' distribution
is an empirical matter; criteria for 'similarity' have
yet to be specified in such a way that the utility of such criteria 
can be judged in other than an ad hoc fashion for each language. 
Yet general theories of grammatical analysis continue to pre- 
scribe rules for the manipulation of such classes as if they had 
been defined by the more rigorous criteria of identical distri- 
bution rather than by vaguer notions of similar distribution. 


The purpose of this note is to show that such classes are 
never in fact defined in terms of identity of distribution and 
thus to focus attention on the necessity of defining 'similarity'. 


The usual formulation is stated most succinctly by Hockett2:
"We define the set E(S) to include all the environments
occupied by any t ∈ T(S) within any other substretch
in T(S). The range of a given substretch t, symbolized
R(t), is the set of all environments E ∈ E(S) in which
t occurs. A set of substretches which includes, with
a given substretch t, all substretches t' such that
R(t') = R(t), and no others, constitutes a form-class
$ of substretches. By definition, form-classes
are obviously pairwise disjunct (that is, no
substretch belongs to more than one form-class)."
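
The definition quoted above can be made concrete with a small sketch. It rests on simplifying assumptions that are not Hockett's: whitespace-separated words stand in for substretches, and an environment is taken to be the pair (everything before the word, everything after it) within one utterance. The function names `ranges` and `form_classes` and the toy corpus are illustrative only.

```python
from collections import defaultdict


def ranges(utterances):
    """Map each word t to R(t), the set of (left, right) environments it occupies."""
    result = defaultdict(set)
    for utterance in utterances:
        tokens = tuple(utterance.split())
        for i, token in enumerate(tokens):
            # Environment = words preceding and words following this occurrence.
            result[token].add((tokens[:i], tokens[i + 1:]))
    return dict(result)


def form_classes(utterances):
    """Group words whose ranges are exactly equal; the groups are pairwise disjoint."""
    by_range = defaultdict(set)
    for token, envs in ranges(utterances).items():
        by_range[frozenset(envs)].add(token)
    return sorted(sorted(members) for members in by_range.values())


# Hypothetical toy corpora, used only to exercise the definitions.
small_corpus = ["the cat sleeps", "the dog sleeps"]
larger_corpus = small_corpus + ["the cat eats"]

print(form_classes(small_corpus))   # 'cat' and 'dog' share one class
print(form_classes(larger_corpus))  # every word is alone in its class
```

In this toy setting a single additional utterance is enough to separate 'cat' from 'dog', since their ranges are no longer exactly equal.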


We disagree flatly with this formulation and introduce the 
following theorem, to be called the Identity Theorem: 


Theorem: Any two items x and y which have exactly the 
same ranges R(x) = R(y) are not two different items but are 
identical: if R(x) = R(y), then x = y.


Proof: (by counter example). Assume x and y are different
items. Then each occurs in an utterance defining it
uniquely, e.g. 'x is a z', 'y is a w'.3 Therefore the utterances
'y is a z' and 'x is a w' are non-occurrent (i.e. impossible).
Ergo x and y do not have the same ranges: R(x) ≠ R(y). Thus
if x is different from y, then R(x) ≠ R(y), which implies our
theorem by elementary logic ("if not q, then not p" is equivalent
to "if p, then q").
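
The closing step of the proof can be written out symbolically. Taking p as R(x) = R(y) and q as x = y (a labeling assumed here to match the quoted schema), the contraposition is:

```latex
\[
\bigl(x \neq y \;\Rightarrow\; R(x) \neq R(y)\bigr)
\;\Longleftrightarrow\;
\bigl(R(x) = R(y) \;\Rightarrow\; x = y\bigr)
\]
```

This only restates the argument in notation; it adds no premise beyond those given in the proof.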


1. C.F. Hockett, A formal statement of morphemic analysis,
SIL 10.2.27-39 (1952).
Z.S. Harris, Methods in Structural Linguistics, Chicago
(1951) 160: "We therefore restrict the application of 12.22
by saying that we will consider particular tentatively independent
phonemic sequences as morphemic segments only
if it will turn out that many of these sequences have identical
(emphasis mine, DLO) relations to many other tentatively
independent phonemic sequences."
R.S. Wells, Immediate Constituents, 12, Lg. 23.2.87
(1947).

2. Hockett, op. cit. p. 32-3.

3. 'is a z' and 'is a w' may of course be as long as is necessary
uniquely to define the items - anything from a morph
or two up to a twenty-volume "handbook" on the subject.

4. "Mentalistic" comment: if it is impossible to discover an
environment in which x occurs but y does not - and vice
versa - then x and y have no differential "communication
value," i.e. are "the same morph," "the same word" or the
like. Items are "different" precisely because they can be
differentiated, and they can be differentiated precisely because
no two "different" ones have identical ranges.